
Conversation

hahnjo (Member) commented Dec 1, 2025

When IMT is turned on and RPageSinkBuf has an RTaskScheduler, we would previously buffer all pages and create tasks to seal / compress them. While this exposes the maximum amount of parallel work, it wastes memory if the other threads are not fast enough to process the tasks. Heuristically, assume that there is enough work in flight once we already buffer more uncompressed bytes than the approximate zipped cluster size, and from then on seal pages immediately (see the sketch below).
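
For illustration, the heuristic boils down to the following sketch; the type and member names (BufferedSink, fBufferedBytes, fApproxZippedClusterSize, ScheduleSealTask, SealPageNow) are made up for this example, and the real RPageSinkBuf logic is more involved:

```cpp
#include <cstddef>
#include <utility>

struct Page { std::size_t fSize = 0; };

// Illustrative stand-in for RPageSinkBuf, not the actual class.
struct BufferedSink {
   bool fHasTaskScheduler = true;
   std::size_t fBufferedBytes = 0;  // uncompressed bytes with pending seal tasks
   std::size_t fApproxZippedClusterSize = 128 * 1024 * 1024;  // default cluster size

   void SealPageNow(const Page &) { /* compress in the calling thread */ }
   void ScheduleSealTask(Page) { /* hand off to a worker via the task scheduler */ }

   void CommitPage(Page page) {
      if (fHasTaskScheduler && fBufferedBytes < fApproxZippedClusterSize) {
         // Below the threshold: buffer the page and expose it as parallel work.
         fBufferedBytes += page.fSize;
         ScheduleSealTask(std::move(page));
      } else {
         // Enough work is already in flight (or no scheduler): seal immediately
         // so that uncompressed pages do not pile up in memory.
         SealPageNow(page);
      }
   }
};
```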

In a small test, writing random data with ROOT::EnableImplicitMT(1) and therefore no extra worker thread, the application used 500 MB before this change for the default cluster size of 128 MiB. After this change, memory usage is reduced to around 430 MB (compared to 360 MB without IMT). The compression factor is around 2.1x in this case, which roughly checks out: instead of buffering the full uncompressed cluster (around compression factor * zipped cluster size = 270 MiB), we now buffer uncompressed pages only up to the approximate zipped cluster size (128 MiB) and then start compressing pages immediately. The compressed results also need to be buffered, but they are much smaller: (1 - 1 / compression factor) * zipped cluster size = 67 MiB. Accordingly, the gain will be higher for larger compression factors.
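
As a quick back-of-the-envelope check of these numbers (values taken from the test above; the program itself is only illustrative):

```cpp
#include <cstdio>

int main() {
   const double zippedClusterMiB = 128.0;  // default cluster size
   const double compressionFactor = 2.1;   // observed in the test

   // Before: up to the full uncompressed cluster is buffered.
   const double beforeMiB = compressionFactor * zippedClusterMiB;  // ~270 MiB

   // After: uncompressed pages up to the zipped cluster size, plus the
   // already-compressed remainder of the cluster.
   const double compressedRemainderMiB =
      (1 - 1 / compressionFactor) * zippedClusterMiB;              // ~67 MiB
   const double afterMiB = zippedClusterMiB + compressedRemainderMiB;  // ~195 MiB

   // Prints roughly: before: 269 MiB, after: 195 MiB, saved: 74 MiB,
   // consistent with the observed drop from 500 MB to 430 MB.
   std::printf("before: %.0f MiB, after: %.0f MiB, saved: %.0f MiB\n",
               beforeMiB, afterMiB, beforeMiB - afterMiB);
   return 0;
}
```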

Closes #18314, backport of #20425

FYI @Dr15Jones @makortel

Created tasks reference *this, so moving is not safe (illustrated below). Moving is also not needed, because RPageSinkBuf is always held inside a std::unique_ptr.

(cherry picked from commit 672dc1a)
(cherry picked from commit c421df1)
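
To illustrate the point about created tasks referencing *this (a generic example, not the actual RNTuple code): a task that captures `this` pins the object's address, so the object must not be moved afterwards, and holding it behind a std::unique_ptr keeps the address stable without any need to move:

```cpp
#include <functional>
#include <memory>

struct Sink {
   int fCounter = 0;
   std::function<void()> MakeSealTask() {
      return [this] { ++fCounter; };  // captures the current address of the Sink
   }
};

int main() {
   auto sink = std::make_unique<Sink>();  // the unique_ptr owns a stable address
   auto task = sink->MakeSealTask();
   // Moving the Sink to another object now would leave `task` with a dangling
   // `this`; since the Sink always lives behind the unique_ptr, no move is needed.
   task();
   return 0;
}
```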
hahnjo requested a review from jblomer, Dec 1, 2025 16:41
hahnjo self-assigned this, Dec 1, 2025
hahnjo requested a review from pcanal as a code owner, Dec 1, 2025 16:41
makortel commented Dec 1, 2025

Thanks!


github-actions bot commented Dec 2, 2025

Test Results

17 files, 17 suites, 2d 20h 26m 58s ⏱️
2 749 tests: 2 748 ✅, 0 💤, 1 ❌
45 154 runs: 45 153 ✅, 0 💤, 1 ❌

For more details on these failures, see this check.

Results for commit a8721ba.


hahnjo merged commit 8a82325 into root-project:v6-36-00-patches, Dec 4, 2025 (51 of 60 checks passed)
hahnjo deleted the ntuple-imt-mem-v636 branch, December 4, 2025 07:47